ABSTRACT

The rate of convergence of the expected value of 
quantile estimates using a stochastic approximation with 
the maximum transformation is evaluated. The analysis is 
performed using linear regression techniques on computer 
simulation results for quantile estimates of the unit 
exponential distribution. Included is a discussion on the 
use of the jackknife technique to reduce the bias of the 
stochastic approximation quantile estimates. Simulation 
results for the 2-fold jackknife for the $m^{-0.5}$ term are
tabulated. The main conclusion of the analysis is that 
the lowest order term in the expression for the expected 
value of the estimate as a function of sample size decreases 
I. INTRODUCTION



The problem of estimating quantiles may be found in
numerous statistical applications. (The $\alpha$ quantile $S_\alpha$
for the distribution $F(s)$ of the random variable $S$ is
defined as $S_\alpha = F^{-1}(\alpha)$, $0 \le \alpha \le 1$, and is assumed to be
unique.) The increased use of computer system simulations
provides some practical examples. In certain cases the 
quantile estimates from the generated data may be the direct 
measures used to evaluate the system. In other applications 
one might find it practical to characterize the distribu- 
tion function of variables at different points in a system 
using several statistics such as quantiles. The practica- 
bility of this latter example may be dictated by the size 
of the simulation and the number of replications performed. 

In any case, the need for efficient quantile estimates
exists.

Two possible solutions to the problem of quantile
estimation are the order statistic estimate $S_\alpha(m)$ and
the stochastic approximation estimate $\hat{S}_\alpha(m)$ of Robbins and
Monro (1951), where $m$ is the sample size. These methods
each have their benefits and limitations which warrant 
comparison. In this regard, the increased use of computer 
simulations dictates that the comparative efficiency of 
these methods not be based solely on statistical efficiency 
but computational efficiency as well. The term computational 
efficiency refers to the time and cost of programming, 
debugging, computing, and data storage requirements in the
application of a given estimation scheme. 

Goodman, Lewis and Robbins (1973) used these considera- 
tions in a comparative analysis of the order statistic 
estimate and the stochastic approximation estimate with 
the maximum (minimum) transformation in estimating extreme 
quantiles (e.g., quantiles corresponding to probability 
levels $\alpha = 0.999$). The maximum transformation is a device
which they utilized to alleviate the problem of the apparently
very slow convergence of the stochastic approximation
when applied to extreme quantiles. They also indicated the
possible existence of similar problems in highly skewed
distributions even for moderate quantiles (e.g., $\alpha = 0.90$).

The considerations for selecting the maximum transfor- 
mation above other possible techniques are discussed by 
Goodman, Lewis, and Robbins (1973). The benefits are that 
it does not require special information about the random 
variable from which the sample is taken to estimate the 
quantile. This kind of information would be required in 
deriving bounds to the quantile estimate to alleviate the 
problem of very slow convergence. The maximum transforma- 
tion is also computationally efficient in that it requires a 
single memory cell and a limited number of binary comparisons. 
These requirements are much simpler and less time consuming 
than the scheme proposed by Kesten (1958) to alleviate 
the problem of long runs. Finally, when the maximum
transformation to the median ($\alpha' = 0.5$) is used,
Cochran and Davis (1965) have shown (using simulations)
that the stochastic approximation works well for this value
of $\alpha'$. The maximum transformation is used to transform
higher quantiles $\alpha > 0.5$ to $\alpha' = 0.5$, and the minimum
transformation is used where $\alpha < 0.5$.

The conclusions of Goodman, Lewis, and Robbins (1973) 
were that the stochastic approximation with maximum trans- 
formation provides a practical alternative to the problem 
of extreme quantile estimation. It is computationally 
efficient; computing time is linearly dependent on sample 
size (m) and storage requirements are small and fixed. 
Statistically, the estimate was relatively free of bias 
although there existed some increase in the variance of 
the estimate. Also included in their report were the
results of the application of the jackknife for a presumed
$m^{-1}$ term in the bias to the stochastic approximation
scheme. Their simulation results indicated that convergence
was slower than $m^{-1}$, and they conjectured the possible
existence of an $m^{-0.5}$ bias term. This implied the possibility
of further improving the stochastic approximation with
maximum transformation using the appropriate jackknifing
procedure. It was this conjecture that prompted the
analysis for this paper.
II. PROBLEM AND APPROACH



The problem approached was to evaluate the rate of 
convergence of the stochastic approximation with a maximum 
transformation with the intent to further improve the 
stochastic approximation with the jackknife method of bias 
reduction proposed by Quenouille (1956). A secondary
problem was to evaluate Tukey's assertion that the sample
variance of the "pseudo-values" used in forming the jackknifed
estimate could be used to estimate the variance of
the estimate, i.e., $\mathrm{Var}(\tilde{S}_\alpha(m))$.

The approach taken to solve the problem consisted of 
two basic steps. Initially, a preliminary analysis was 
made on the data obtained by Goodman, Lewis, and Robbins 
(1973) to determine if the bias was of the order $O(m^{-0.5})$
rather than $O(m^{-1})$. The linear least squares computer program
of Daniel and Wood (1971) was used to accomplish this
analysis. Once the order of the bias term was determined,
a computer simulation was used to evaluate the effect of 
the jackknife on the rate of convergence and the variance 
of the stochastic approximation estimate with maximum 
transformation and to determine if the sample variance of 
the pseudo-values could be used to estimate the variance
of the quantile estimate. The distributions and quantiles 
selected for analysis were the unit exponential 0.5, 0.9, 
0.95, 0.975, and 0.999 quantiles and the uniform (0,1) 0.5 
quantile. Initially, the simulation was programmed to 
achieve an m-fold jackknifed estimate but was subsequently
modified to the 2-fold jackknife for reasons discussed
later in the paper.
III. LINEAR REGRESSION



The preliminary analysis in evaluating the rate of 
convergence of the stochastic approximation with maximum 
transformation was accomplished using a regression analysis 
on the original data obtained by Goodman, Lewis, and 
Robbins (1973). These data consisted of 12 points covering
reduced sample size values from $m' = 8$ to 60, where
$m' = m/v$ and $v$ is a constant computed for the maximum
transformation, with corresponding values of the mean and
variance of the stochastic approximation estimates with
maximum transformation both with and without jackknife.
The data used for the regression analysis were for the 0.999
quantile estimates of the unit exponential distribution.

Two equations were fitted to the data to substantiate
the existence of a term of lower order than $m^{-1}$. In each
case a weighted linear regression was used to account for
the variability of the variance over the range of sample
values. The weighting factor used was the inverse of the
sample variance.

The initial regression was applied based on a logarithmic
transformation. The regression equation used was

$$\ln(\text{bias}) = \ln(a) - \gamma \ln(m'), \tag{3.1}$$

where the values of the bias were computed as

$$b = E(\hat{S}_\alpha(m')) - S_\alpha \tag{3.2}$$

and

$$\tilde{b} = E(\tilde{S}_\alpha(m')) - S_\alpha \tag{3.3}$$

for the data without and with jackknife respectively.

If the bias is such that $b = a m^{-\gamma}$ and higher order terms
can be neglected, then the coefficients obtained from the
regression will provide a direct estimate of $\gamma$. An obvious
problem exists if the bias contains secondary terms that
are significant. In that case the estimate $\hat{\gamma}$ will be incorrect
since the effects of secondary terms are not distinguishable
under the logarithmic transformation. Consequently, a
second equation was also applied using multiple independent
variables to check the significance of any secondary terms.

The second regression equation used was

$$\text{bias} = a m^{-\gamma} + b m^{-2\gamma} + c m^{-3\gamma}, \tag{3.4}$$

where $\gamma$ is considered a known constant to preclude the
requirement for a non-linear regression. The inherent
problem with applying equation 3.4 with a linear regression
is that an estimate or guess of $\gamma$ is required. To achieve
this estimate the results of the initial regression in conjunction
with the conjecture that $\gamma = 0.5$ were evaluated to
derive an estimate for implementing the regression using
Equation 3.4.
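
The two regressions described above can be sketched in a few lines of Python. This is only an illustration, not the Daniel and Wood (1971) program: the function names are hypothetical, the weights are taken as the inverse sample variances in both fits (an assumption for the logarithmic fit), and the second fit keeps only the first two terms of equation 3.4 with $\gamma$ fixed in advance.

```python
import math

def weighted_line_fit(x, y, w):
    """Weighted least squares fit of y = c0 + c1 * x; returns (c0, c1)."""
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    c1 = sxy / sxx
    return ybar - c1 * xbar, c1

def fit_log_bias(m, bias, var):
    """Equation 3.1: regress ln(bias) on ln(m); returns (a, gamma).

    Requires positive bias values. Weight = 1 / sample variance (assumed).
    """
    w = [1.0 / v for v in var]
    c0, c1 = weighted_line_fit([math.log(mi) for mi in m],
                               [math.log(bi) for bi in bias], w)
    return math.exp(c0), -c1

def fit_two_terms(m, bias, var, gamma):
    """Fit bias = a * m**-gamma + b * m**-(2*gamma) with gamma held fixed."""
    w = [1.0 / v for v in var]
    x1 = [mi ** (-gamma) for mi in m]
    x2 = [mi ** (-2.0 * gamma) for mi in m]
    # Weighted normal equations for a no-intercept, two-regressor model.
    s11 = sum(wi * p * p for wi, p in zip(w, x1))
    s22 = sum(wi * q * q for wi, q in zip(w, x2))
    s12 = sum(wi * p * q for wi, p, q in zip(w, x1, x2))
    s1y = sum(wi * p * y for wi, p, y in zip(w, x1, bias))
    s2y = sum(wi * q * y for wi, q, y in zip(w, x2, bias))
    det = s11 * s22 - s12 * s12
    return (s1y * s22 - s2y * s12) / det, (s2y * s11 - s1y * s12) / det
```

Fixing $\gamma$ keeps the second fit linear in the coefficients, which is the reason given in the text for treating $\gamma$ as known.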






IV. JACKKNIFE 



Based on the assumption that the bias of the quantile
estimate $\hat{S}_\alpha(m)$ is of the form

$$E(\hat{S}_\alpha) - S_\alpha = a m^{-\gamma} + b m^{-2\gamma} + O(m^{-3\gamma}), \tag{4.1}$$

the jackknife technique can be applied to obtain an estimate
$\tilde{S}_\alpha(m)$ with a bias free of the $m^{-\gamma}$ term:

$$E(\tilde{S}_\alpha) - S_\alpha = b' m^{-2\gamma} + O(m^{-3\gamma}). \tag{4.2}$$



The initial approach to applying the technique was
with the m-fold jackknife. Let $\hat{S}_\alpha(m)$ be the estimate based
on all $m$ sample values and $\hat{S}_\alpha(m-1)_i$, $i = 1,\ldots,m$, be the
estimate based on $m-1$ sample values obtained by omitting
the $i$th value. Then consider the pseudo-values $\tilde{S}_\alpha(m)_i$,
$i = 1,\ldots,m$, computed as

$$\tilde{S}_\alpha(m)_i = A\,\hat{S}_\alpha(m) + B\,\hat{S}_\alpha(m-1)_i \quad (i = 1,\ldots,m). \tag{4.3}$$

The jackknifed estimate is then computed as the average
of the $m$ pseudo-values:

$$\tilde{S}_\alpha = \frac{1}{m}\sum_{i=1}^{m} \tilde{S}_\alpha(m)_i. \tag{4.4}$$



From equations 4.1, 4.3, and 4.4, if $A$ and $B$ are selected
such that

$$A = \frac{m^\gamma}{m^\gamma - (m-1)^\gamma} \quad \text{and} \quad B = \frac{-(m-1)^\gamma}{m^\gamma - (m-1)^\gamma}, \tag{4.5}$$
then the jackknifed estimate will have a bias which is
$O(m^{-2\gamma})$, or

$$E(\tilde{S}_\alpha) - S_\alpha = \frac{-b}{m^\gamma (m-1)^\gamma} + O(m^{-3\gamma}). \tag{4.6}$$

In the present case where $\gamma$ is assumed to be 0.5 the values
of $A$ and $B$ used in the simulation were

$$A = \frac{\sqrt{m}}{\sqrt{m} - \sqrt{m-1}} \quad \text{and} \quad B = \frac{-\sqrt{m-1}}{\sqrt{m} - \sqrt{m-1}}. \tag{4.7}$$
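
As an illustration of equations 4.3 through 4.5, the m-fold jackknife can be sketched as follows. The function names and the `estimator` callable (any function mapping a list of sample values to an estimate) are hypothetical; leave-one-out subsamples keep the original order, which matters for an order-dependent estimate such as the stochastic approximation.

```python
def mfold_weights(m, gamma):
    """Weights A and B of equation 4.5 for a bias term of order m**-gamma."""
    d = m ** gamma - (m - 1) ** gamma
    return m ** gamma / d, -((m - 1) ** gamma) / d

def mfold_jackknife(sample, estimator, gamma):
    """Return (jackknifed estimate, pseudo-values) per equations 4.3 and 4.4."""
    m = len(sample)
    a_w, b_w = mfold_weights(m, gamma)
    full = estimator(sample)
    # Pseudo-value i combines the full estimate with the estimate
    # obtained after omitting the ith sample value.
    pseudo = [a_w * full + b_w * estimator(sample[:i] + sample[i + 1:])
              for i in range(m)]
    return sum(pseudo) / m, pseudo
```

With `gamma = 1` and the sample mean as the estimator, the pseudo-values reduce to the sample values themselves, which is a convenient sanity check. Note that each call re-runs the estimator $m$ times.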

The 2-fold jackknife is computed with the modification
that the sample is partitioned into two disjoint sections
with $\hat{S}_\alpha(m/2)_i$, $i = 1, 2$, computed from the $i$th section of half
the sample values. Equations 4.3 and 4.4 then reduce to

$$\tilde{S}_\alpha(m)_1 = A\,\hat{S}_\alpha(m) + B\,\hat{S}_\alpha(m/2)_1, \tag{4.8}$$

$$\tilde{S}_\alpha(m)_2 = A\,\hat{S}_\alpha(m) + B\,\hat{S}_\alpha(m/2)_2, \tag{4.9}$$

and

$$\tilde{S}_\alpha = \tfrac{1}{2}\left(\tilde{S}_\alpha(m)_1 + \tilde{S}_\alpha(m)_2\right). \tag{4.10}$$



There is a problem here in that it is possible to segment the 
sample into two disjoint sections in many ways; in what 
follows the sections were the first and second halves of 
the data. 

The values of $A$ and $B$ required to achieve the reduction
in bias then become

$$A = \frac{2^\gamma}{2^\gamma - 1} \quad \text{and} \quad B = \frac{-1}{2^\gamma - 1}. \tag{4.11}$$
The resulting jackknifed estimate then is biased as $O(m^{-2\gamma})$,
or specifically

$$E(\tilde{S}_\alpha) - S_\alpha = \frac{2^\gamma - 2^{2\gamma}}{2^\gamma - 1}\,(b/m^{2\gamma}) + O(m^{-3\gamma}). \tag{4.12}$$

Based on the assumption $\gamma = 0.5$ the values for $A$ and
$B$ are

$$A = \frac{\sqrt{2}}{\sqrt{2} - 1} \quad \text{and} \quad B = \frac{-1}{\sqrt{2} - 1}, \tag{4.13}$$

with the result that the jackknifed estimate has the form

$$E(\tilde{S}_\alpha) = S_\alpha + \frac{\sqrt{2} - 2}{\sqrt{2} - 1}\,(b/m) + O(m^{-1.5}). \tag{4.14}$$

These calculations indicate that if the jackknife is 
performed for the correct term, the resulting estimate 
should converge more rapidly but with a negative bias. 
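
A minimal sketch of the 2-fold jackknife follows, assuming the sample is split into its first and second halves as in the text. The function names and the `estimator` callable are hypothetical.

```python
def twofold_weights(gamma):
    """Weights A = 2**gamma / (2**gamma - 1) and B = -1 / (2**gamma - 1)."""
    d = 2 ** gamma - 1
    return 2 ** gamma / d, -1 / d

def twofold_jackknife(sample, estimator, gamma=0.5):
    """Average of the two pseudo-values built from the half-sample estimates."""
    half = len(sample) // 2
    a_w, b_w = twofold_weights(gamma)
    full = estimator(sample)
    p1 = a_w * full + b_w * estimator(sample[:half])   # first half
    p2 = a_w * full + b_w * estimator(sample[half:])   # second half
    return 0.5 * (p1 + p2)
```

For an estimator whose bias is exactly $a m^{-\gamma}$, the weights cancel the bias completely; with a further $b m^{-2\gamma}$ term the residual coefficient is $A + B\,2^{2\gamma} = -2^{\gamma}$, which is the source of the negative bias noted above when $b > 0$.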

A logical question that arises is what are the consequences
of applying the jackknife for an erroneous term,
say $m^{-\beta}$. Considering the true bias to be of the form in
equation 4.1 and applying the jackknife with $A = 2^\beta/(2^\beta - 1)$
and $B = -1/(2^\beta - 1)$ the resulting estimate will have the
form

$$E(\tilde{S}_\alpha) = S_\alpha + \frac{2^\beta - 2^\gamma}{2^\beta - 1}\,(a/m^\gamma) + O(m^{-2\gamma}). \tag{4.15}$$

This result indicates that if the true bias is of the form
$m^{-\gamma}$ such that $\gamma > \beta$, where $\beta$ is the power of the term being
jackknifed, the resulting estimator should remain biased
with leading term still decreasing as $m^{-\gamma}$ but with a
coefficient $k = (2^\beta - 2^\gamma)/(2^\beta - 1) < 0$. Similarly, if
$\gamma$ and $\beta$ are such that $\gamma < \beta$ and $c\gamma = \beta$ where $c$ is some
positive integer then

$$E(\tilde{S}_\alpha) = S_\alpha + ka/m^\gamma + O(m^{-2\gamma}) - O(m^{-c\gamma}). \tag{4.16}$$

Consider the case where $\gamma = 0.5$ and the jackknife is
applied for the $m^{-1}$ term ($\beta = 1$). The resulting estimator
will be biased and of the form

$$E(\tilde{S}_\alpha) = S_\alpha + 0.586\, a m^{-0.5} + O(m^{-1.5}). \tag{4.17}$$

Thus there is a reduction in the magnitude of the $m^{-0.5}$
term in the bias, but the leading term is still of the
same order as before.
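
The multiplier of the leading bias term when the 2-fold jackknife is applied for a term $m^{-\beta}$ while the true leading term is $m^{-\gamma}$ can be checked numerically. The function name below is hypothetical.

```python
def leading_coefficient(beta, gamma):
    """Multiplier k = (2**beta - 2**gamma) / (2**beta - 1) of the a/m**gamma term."""
    return (2 ** beta - 2 ** gamma) / (2 ** beta - 1)

# Jackknifing for m**-1 when the true leading term is m**-0.5:
k = leading_coefficient(1.0, 0.5)
```

Here `k` is $2 - \sqrt{2}$, so the leading term is reduced in magnitude but not removed; `leading_coefficient(gamma, gamma)` is zero, which is the correctly specified case.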

Casual inspection of the simulation results in Goodman,
Lewis, and Robbins (1973) indicates that the bias is decreasing
more slowly than $m^{-1}$. If the leading term is in fact $m^{-0.5}$, the
effect of the jackknife is precisely as in Equation 4.17.

The bias results obtained by Goodman, Lewis, and Robbins 
(1973) are shown in Figure 1. 

In the next sections the two types of quantile estima- 
tion are discussed in more detail. 
V. QUANTILE ESTIMATE: ORDER STATISTIC



The order statistic estimate $S_\alpha(m)$ can be obtained by
ordering the sample and using $S_{([\alpha m])}$ as the order
statistic estimate, where $[\alpha m]$ is the greatest integer
less than or equal to $\alpha m$ and $S_{(i)}$ is the $i$th ordered value.

The distribution and density function of the ordered
sample are:

$$F_{S_{(i)}}(s) = \sum_{k=i}^{m} \binom{m}{k} [F(s)]^k [1-F(s)]^{m-k} \tag{5.1}$$

$$f_{S_{(i)}}(s) = \binom{m}{i}\, i\, [F(s)]^{i-1} [1-F(s)]^{m-i} f(s). \tag{5.2}$$


For the uniform distribution (0,1) the mean and variance of the
order statistic then become

$$E(S_{(i)}) = i/(m+1), \tag{5.3}$$

and

$$\mathrm{Var}(S_{(i)}) = i(m-i+1)/\{(m+1)^2 (m+2)\}. \tag{5.4}$$

If the sample size $m$ is such that $\alpha m$ is equal to an integer,
then $S_\alpha(m) = S_{([\alpha m])}$ and equations 5.3 and 5.4 become

$$E(S_\alpha(m)) = \alpha m/(m+1) = \alpha - [\alpha/(m+1)] \tag{5.5}$$

and

$$\mathrm{Var}(S_\alpha(m)) = \alpha m(m-\alpha m+1)/\{(m+1)^2 (m+2)\}. \tag{5.6}$$




Results using equations 5.5 and 5.6 are tabulated in Table 1 
for the uniform (0,1) 0.5 quantile. 
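
The exact uniform (0,1) results can be evaluated directly. The sketch below assumes $\alpha m$ is an integer, as in the text; the function name is hypothetical.

```python
def uniform_order_stat_moments(alpha, m):
    """Mean and variance of the [alpha*m]th order statistic of m uniform(0,1) values."""
    i = round(alpha * m)
    if abs(i - alpha * m) > 1e-9:
        raise ValueError("alpha * m must be an integer")
    mean = i / (m + 1)
    var = i * (m - i + 1) / ((m + 1) ** 2 * (m + 2))
    return mean, var

mean, var = uniform_order_stat_moments(0.5, 10)
bias = mean - 0.5          # equals -alpha / (m + 1)
```

The bias of the order statistic estimate is therefore exactly of order $m^{-1}$ in this case.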
Cox and Lewis (1966), using direct methods, arrived
at the following results for the unit exponential distribution:

$$E(S_\alpha(m)) = -\ln(1-\alpha) - \frac{1}{2m}\left(\frac{\alpha}{1-\alpha}\right) + \frac{1}{12m^2}\left[\frac{1-(1-\alpha)^2}{(1-\alpha)^2}\right] + O(m^{-3}) \tag{5.7}$$

and

$$\mathrm{Var}(S_\alpha(m)) = \frac{\alpha}{m(1-\alpha)} - \frac{1}{2m^2}\left[\frac{1}{(1-\alpha)^2} - 1\right] + O(m^{-3}). \tag{5.8}$$

These equations were used to compute the results in
Tables 2 through 6 for the unit exponential 0.5, 0.9, 0.95,
0.975, and 0.999 quantiles.
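
For the unit exponential distribution, the mean and variance of the $i$th order statistic are the finite sums $\sum 1/j$ and $\sum 1/j^2$ over $j = m-i+1,\ldots,m$ (a standard result). The sketch below compares these exact sums with large-$m$ expansions obtained from the asymptotic series for harmonic numbers; the expansions are written here as an assumption and are not claimed to match the printed form in Cox and Lewis (1966). Function names are hypothetical, and $\alpha m$ is assumed to be an integer.

```python
import math

def exp_order_stat_exact(alpha, m):
    """Exact mean and variance of the (alpha*m)th exponential order statistic."""
    i = round(alpha * m)
    terms = range(m - i + 1, m + 1)
    mean = sum(1.0 / j for j in terms)
    var = sum(1.0 / j ** 2 for j in terms)
    return mean, var

def exp_order_stat_expansion(alpha, m):
    """Large-m expansions of the same mean and variance (assumed form)."""
    q = 1.0 - alpha
    mean = (-math.log(q) - alpha / (2 * m * q)
            + (1 - q * q) / (12 * m * m * q * q))
    var = alpha / (m * q) - (1 / (q * q) - 1) / (2 * m * m)
    return mean, var
```

The leading bias term, $-\alpha / \{2m(1-\alpha)\}$, grows rapidly as $\alpha$ approaches 1, which is one way to see why extreme quantiles need large samples.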
ORDER STATISTIC ESTIMATE: Uniform (0,1) Distribution
(Exact Results) .500 Quantile $S_{\alpha=.5} = 0.5000$

ORDER STATISTIC ESTIMATE: Unit Exponential Distribution
(Exact Results) .500 Quantile $S_{\alpha=.5} = 0.693147$

ORDER STATISTIC ESTIMATE: Unit Exponential Distribution
(Exact Results) .999 Quantile $S_{\alpha=.999} = 6.907755$
VI. QUANTILE ESTIMATE: STOCHASTIC APPROXIMATION



The maximum transformation was used with the stochastic
approximation in estimating extreme quantiles to alleviate
the problem of slow convergence, as previously discussed.
To implement the maximum transformation consider the sample
$S_i$, $i = 1,\ldots,m$, from the distribution $F(s) = \mathrm{prob}(S \le s)$ from
which the extreme quantile $S_\alpha$, $\alpha > 0.5$, is to be estimated.
If the sample is partitioned into subsets of $v$ values each
(assuming $m' = m/v$) and the maximum value is taken from each
of the $m'$ subsets the result is a reduced sample $S'_1,\ldots,S'_{m'}$.
The distribution of the random variable in the reduced
sample, $F_{\max}(s)$, is then given by

$$F_{\max}(S_\alpha) = F^v(S_\alpha) = \alpha^v = \alpha' \tag{6.1}$$

and consequently $S'_{\alpha'} = S_\alpha$.

This result implies that if the transformation to the
median $\alpha' = 0.5$ is used, and $v$ is chosen such that
$v = \ln(\alpha')/\ln(\alpha)$, then the stochastic approximation can
be used to estimate the median $S'_{\alpha'=.5}$ of the
reduced sample $S'_1,\ldots,S'_{m'}$, which is equal to the extreme
quantile $S_\alpha$ from the original sample.
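
The maximum transformation on an explicit sample can be sketched as below. The function names are hypothetical; in practice $v$ must be an integer, so the real-valued result of `block_size` would be rounded and the transformed level $\alpha' = \alpha^v$ would differ slightly from 0.5.

```python
import math

def block_size(alpha, alpha_prime=0.5):
    """Real-valued v satisfying alpha**v == alpha_prime."""
    return math.log(alpha_prime) / math.log(alpha)

def reduce_by_maximum(sample, v):
    """Partition the sample into consecutive blocks of v values and keep each maximum.

    Any incomplete trailing block is dropped.
    """
    m_prime = len(sample) // v
    return [max(sample[k * v:(k + 1) * v]) for k in range(m_prime)]
```

Only a running maximum is needed per block, which is the single memory cell and limited number of comparisons mentioned in the introduction.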

The primary distribution used in the simulation was
the unit exponential distribution, $F(s) = 1 - e^{-s}$. To minimize
the computing time of the program the maximum transformation
was simulated by generating the reduced sample
directly. Since

$$F_{\max}(s) = (1 - e^{-s})^v = r, \quad 0 < r < 1, \tag{6.2}$$

solving for $s$ gives

$$s = -\ln(1 - r^{1/v}). \tag{6.3}$$

Hence, $m'$ pseudo-random numbers $r_1,\ldots,r_{m'}$ were generated
and the reduced sample $S'_1,\ldots,S'_{m'}$ was computed directly
using equation 6.3 (i.e., $S'_1 = -\ln(1 - r_1^{1/v})$, etc.).¹ This
procedure saved considerable computing time when $v$ is much
greater than 10, by reducing the number of pseudo-random
numbers required from $m$ down to $m' = m/v$. The values of
the parameters used in the simulation are listed in the
following table.



    $\alpha$     0.5    0.9        0.95       0.975      0.999
    $v$          1      7          14         27         700
    $\alpha'$    0.5    0.47830    0.48768    0.50481    0.49641
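
A minimal sketch of the direct generation of the reduced sample by inversion, using Python's standard `random` module in place of the RANDOM generator used in the original study. The function names are hypothetical, and `math.expm1` is used to evaluate $1 - r^{1/v}$ accurately when $v$ is large.

```python
import math
import random

def reduced_sample_exponential(m_prime, v, rng):
    """Draw m_prime block maxima of v unit exponential values by inversion."""
    out = []
    for _ in range(m_prime):
        r = rng.random()
        while r == 0.0:                 # avoid log(0)
            r = rng.random()
        # 1 - r**(1/v) written as -expm1(ln(r) / v) for numerical accuracy.
        out.append(-math.log(-math.expm1(math.log(r) / v)))
    return out

def transformed_level(alpha, v):
    """Probability level alpha' = alpha**v of the reduced sample."""
    return alpha ** v

sample = reduced_sample_exponential(1000, 7, random.Random(12345))
```

Each reduced value costs one pseudo-random number instead of $v$, which is the saving described in the text.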



¹ The random number generator, RANDOM, used was an
adaptation of the generator reported in Lewis, P.A.W.,
Goodman, A.S., and Miller, J.M., "A Pseudo-Random Number
Generator for the System/360," IBM Systems Journal, v. 8,
No. 2, 1969. The generator was modified by G. P. Learmonth
to produce double precision pseudo-random numbers, and a
shuffling scheme was implemented to improve the randomness of these
numbers. Copies of the program may be obtained from
G. P. Learmonth.

The stochastic approximation was then applied to the
reduced sample to obtain the estimate $\hat{S}_{\alpha'}(m')$, which is the
value obtained on the $m'$th iteration using the following
equation:

$$\hat{S}_{\alpha'}(i) = \hat{S}_{\alpha'}(i-1) - \frac{C}{i}\left[1 - \mathrm{sgn}\{S'_i - \hat{S}_{\alpha'}(i-1)\} - \alpha'\right], \tag{6.4}$$

where the function sgn(x) assumes the value 1 if $x > 0$ and
0 if $x < 0$, $C$ is the inverse of the density $f(S'_{\alpha'})$, and
$\hat{S}_{\alpha'}(0)$ is an arbitrary starting value.
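
The recursion in equation 6.4 can be sketched as a short loop. The function name is hypothetical, the constant `c` and the starting value are passed in rather than estimated, and the tie case $x = 0$ (left undefined by the sgn convention above) is treated here as sgn = 0, which is an assumption.

```python
def stochastic_approximation(reduced_sample, alpha_prime, c, start):
    """Robbins-Monro estimate of the alpha_prime quantile of the reduced sample."""
    s_hat = start
    for i, s in enumerate(reduced_sample, start=1):
        # 1 - sgn{s - s_hat}: equals 1 when the observation does not exceed s_hat.
        indicator = 0.0 if s > s_hat else 1.0
        s_hat -= (c / i) * (indicator - alpha_prime)
    return s_hat
```

Only the current estimate is stored, so storage is fixed and the computing time is linear in the reduced sample size, in line with the properties cited from Goodman, Lewis, and Robbins (1973).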

The initial values of $C$ and $\hat{S}_{\alpha'}(0)$ were obtained using
the first three sample values $S'_1$, $S'_2$, and $S'_3$ from the
reduced sample. These values were ordered to get $S'_{(1)}$,
$S'_{(2)}$, and $S'_{(3)}$. The following approximations were then
used for $C$ and $\hat{S}_{\alpha'}(0)$.

a ’ 



c . iiiiii) 



S (lP ^ S (3) ~ S (2)' ) ] 



^ S '(2)'- *(1)^ CS (3) ‘ S U) JJ 



and $\hat{S}_{\alpha'}(0) = S'_{(2)}$. (6.5)
VII. RESULTS AND CONCLUSIONS



The results of the linear regression on the 0.999
quantile estimates for the unit exponential distribution
without jackknife in Goodman, Lewis, and Robbins (1973)
provided some evidence that a term of lower order than
$m^{-1}$ did exist. The results using equation 3.1 were

$$\text{bias} = 20.4\,(m')^{-0.697} \tag{7.1}$$

and from equation 3.4 the results were

$$\text{bias} = 2.328\,(m')^{-0.5} + 93.67\,(m')^{-1}. \tag{7.2}$$

From equation 7.2 it was apparent that the $m^{-0.5}$ term was
significant in the bias of the estimate even though the
value $\gamma = 0.5$ was based on a conjecture only partially
substantiated by the results in equation 7.1. The result
that $\hat{\gamma} = 0.697$ can be partially accounted for by the fact
that the $m^{-1}$ term is still a significant factor at the
sample values considered. The results of the linear
regression on the jackknifed data in Goodman, Lewis, and
Robbins (1973) did not provide any additional insight
into the rate of convergence of the quantile estimate.

Although these results were not conclusive in
establishing the exact order of the bias, they did tend
to support the conjecture that the leading term was
$m^{-0.5}$. It was therefore decided to extend $m'$ out to a value
of 200 and jackknife for an $m^{-0.5}$ term.
The simulation was initially programmed to achieve an
m-fold jackknife. However, results of the simulation were 
that the variance of the estimate tended to increase above 
a practical level. The variance of the jackknifed estimate 
for the 0.999 quantile of the unit exponential distribution 
was as much as 100 times higher than that of the stochastic 
approximation without the jackknife. These results are 
listed in Table 7 and may be compared with the final 
results of the 2-fold jackknife listed in Table 19 for the 
same quantile. It was thus concluded that without some 
kind of data transformation the m-fold jackknife was 
impractical. Consequently, the program was modified to 
investigate the 2-fold jackknife. 

The variance reduction technique using control variables 
as discussed by Gaver (1969) was initially programmed into 
the simulation in an effort to reduce the run time required 
to achieve a reasonable value for the variance of the 
estimate. The control variable used was the number of
sample values from the reduced sample that exceeded $S_\alpha$
divided by $m'$, the total number of sample values. The
result was that a reduction of approximately 20 per cent 
for the variance of the estimate of the expected value of 
the quantile estimate without the jackknife was achieved. 
However, the effect on the variance of the estimate of the 
expected value of the jackknifed estimate was negligible. 

Converting the program from the m-fold to the 2-fold
jackknife precluded the evaluation of Tukey's idea of
estimating the variance of the quantile estimate using the
variance of the pseudo-values. This assertion is based on
the premise that the distribution of the pseudo-values
tends to normality and that the pseudo-values are approximately
uncorrelated. However, results from the data
obtained by Goodman, Lewis, and Robbins (1973) for the
coefficients of skewness and excess of the estimates obtained
via a stochastic approximation tend to negate the assumption
of a normal distribution. These results are plotted in
Figures 2 and 3. If, as the theory of stochastic approximation
asserts, the asymptotic distribution is normal, the
convergence is very slow. It would therefore appear that
using the pseudo-values to estimate the variance would not
be fruitful in the present application of the jackknife
without some kind of normalizing transformation. It may
be that the basic problem is the correlation between
pseudo-values, not the lack of normality; but this has not been
investigated.

UNIT EXPONENTIAL .999 QUANTILE: $S_{\alpha=.999} = 6.907755$
STOCHASTIC APPROXIMATION WITH MAXIMUM TRANSFORMATION V=700
m-FOLD JACKKNIFE FOR 1/(m**.5) TERM 30,000 REPLICATIONS
(Columns: WITHOUT JACKKNIFE and WITH JACKKNIFE; tabulated values not recoverable.)

The final simulation was programmed to compute the 
stochastic approximation with maximum transformation with 
and without a 2-fold jackknife for the $m^{-0.5}$ term. The
results for the unit exponential 0.5, 0.9, 0.95, 0.975, and 
0.999 quantile and the 0.5 quantile of the uniform (0,1) 
distribution are listed in Tables 8 through 19 with plots 
of the estimated bias versus sample size. The remaining 
evaluation is based on the results obtained for the unit 
exponential distribution. 
UNIFORM (0,1) .500 QUANTILE: $S_{\alpha=.5} = .500$
STOCHASTIC APPROXIMATION WITH MAXIMUM TRANSFORMATION V=1
WITHOUT JACKKNIFE 90,000 REPLICATIONS

Quantities in brackets are estimates of the standard deviations of the estimates.

UNIT EXPONENTIAL .900 QUANTILE: $S_{\alpha=.9} = 2.302585$
STOCHASTIC APPROXIMATION WITH MAXIMUM TRANSFORMATION V=7
WITHOUT JACKKNIFE 60,000 REPLICATIONS

TABLE 12

UNIT EXPONENTIAL .900 QUANTILE: $S_{\alpha=.9} = 2.302585$
STOCHASTIC APPROXIMATION WITH MAXIMUM TRANSFORMATION V=7
WITH JACKKNIFE $m^{-0.5}$ TERM 60,000 REPLICATIONS

TABLE 13

UNIT EXPONENTIAL .950 QUANTILE: $S_{\alpha=.95} = 2.995732$
STOCHASTIC APPROXIMATION WITH MAXIMUM TRANSFORMATION V=14
WITHOUT JACKKNIFE 60,000 REPLICATIONS

UNIT EXPONENTIAL .950 QUANTILE: $S_{\alpha=.95} = 2.995732$
STOCHASTIC APPROXIMATION WITH MAXIMUM TRANSFORMATION V=14
WITH JACKKNIFE $m^{-0.5}$ TERM 60,000 REPLICATIONS

(Plot of estimated bias versus SAMPLE SIZE: Unit Exponential .95 Quantile,
$S_{\alpha=.95} = 2.995732$; Stochastic Approximation With Maximum
Transformation V=14; 2-Fold Jackknife 1/(m**.5) Term; 60,000 Replications.)

UNIT EXPONENTIAL .975 QUANTILE: $S_{\alpha=.975} = 3.688879$
STOCHASTIC APPROXIMATION WITH MAXIMUM TRANSFORMATION V=27
WITHOUT JACKKNIFE 60,000 REPLICATIONS

UNIT EXPONENTIAL .975 QUANTILE: $S_{\alpha=.975} = 3.688879$
STOCHASTIC APPROXIMATION WITH MAXIMUM TRANSFORMATION V=27
WITH JACKKNIFE $m^{-0.5}$ TERM 60,000 REPLICATIONS

UNIT EXPONENTIAL .999 QUANTILE: $S_{\alpha=.999} = 6.907755$
STOCHASTIC APPROXIMATION WITH MAXIMUM TRANSFORMATION V=700
WITHOUT JACKKNIFE 60,000 REPLICATIONS

TABLE 18

UNIT EXPONENTIAL .999 QUANTILE: $S_{\alpha=.999} = 6.907755$
STOCHASTIC APPROXIMATION WITH MAXIMUM TRANSFORMATION V=700
WITH JACKKNIFE $m^{-0.5}$ TERM 60,000 REPLICATIONS



In each case the jackknife estimate was negatively
biased, as predicted from equation 4.14. However, evaluation
of the 0.5 quantile data (Table 11) indicates that the
resulting rate of convergence is on the order of $m^{-.289}$.
Further, successive regressions on the points in the tail
of the curve indicate that the value of $\gamma$ is decreasing,
possibly to $\gamma = 0.25$. Also, the sample bias obtained for
the 0.5 quantile goes negative at a sample size between
192 and 200, which negates the assumption that the
coefficients in equation 3.4 are positive.

The results support the conjecture that an $m^{-.5}$ term
was significant in the bias. However, it also appears that
there exists a lower order term, probably of the order
$m^{-.25}$. This would imply that the bias may be of the form

$$\text{bias} = a m^{-.25} + b m^{-.5} + c m^{-.75} + d m^{-1} + O(m^{-1.25}). \tag{7.3}$$

A weighted linear regression using this equation was
performed on the data for the unit exponential $\alpha = 0.5$
quantile without jackknife listed in Table 10. The resulting
estimates for the leading four terms were $\hat{a} = -0.0501824$,
$\hat{b} = 0.0251160$, $\hat{c} = 0.790984$, and $\hat{d} = -0.667208$. The values
of the observed bias, the bias estimated from the equation fit
by the weighted linear regression, and the residuals are
listed in Table 20. The equation fit achieved a squared
multiple correlation coefficient of .9999. Note that the
simulation for this term is very large (390,000 replications)
so that such a large value of $R^2$ is not surprising.
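A fit of this kind can be sketched as follows. This is a minimal illustration and not the program used for the thesis: the helper names (`design_row`, `solve`, `weighted_fit`), the grid of sample sizes, and the standard deviations used to form the weights are all assumptions, and the bias values are synthetic, generated from the four reported coefficients rather than taken from Table 10. The sketch forms the weighted normal equations for the basis in 7.3 and solves them by Gaussian elimination.

```python
def design_row(m):
    """Basis of equation 7.3: m^-.25, m^-.5, m^-.75, m^-1."""
    return [m ** -0.25, m ** -0.5, m ** -0.75, m ** -1.0]


def solve(matrix, rhs):
    """Solve a small linear system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= factor * aug[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def weighted_fit(sizes, biases, sds):
    """Weighted least squares: minimise sum(((bias - model) / sd) ** 2)."""
    rows = [design_row(m) for m in sizes]
    weights = [1.0 / sd ** 2 for sd in sds]
    p = 4
    gram = [[sum(w * r[i] * r[j] for w, r in zip(weights, rows))
             for j in range(p)] for i in range(p)]
    moment = [sum(w * r[i] * y for w, r, y in zip(weights, rows, biases))
              for i in range(p)]
    return solve(gram, moment)


# Synthetic, noise-free data built from the reported estimates (not Table 10).
reported = [-0.0501824, 0.0251160, 0.790984, -0.667208]
sizes = [2 ** k for k in range(3, 13)]              # hypothetical sample sizes
biases = [sum(c * x for c, x in zip(reported, design_row(m))) for m in sizes]
sds = [m ** -0.5 for m in sizes]                    # hypothetical standard deviations
fitted = weighted_fit(sizes, biases, sds)
print(fitted)
```

Because the synthetic data contain no noise, the fit should return the generating coefficients to within rounding error. With real simulation output, a natural choice of weights would be the reciprocals of the squared standard deviations reported in brackets in the tables.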
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Results of the weighted linear regression on the estimated bias of the 
stochastic approximation with maximum transformation without jackknife 
for the .500 quantile of the unit exponential distribution. 
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TABLE 20 



If the conjecture that the bias is of the form in 7.3
is true, the bias computed as a result of the jackknife
for the $m^{-.5}$ term would be

$$\text{bias} = +.544\,a m^{-.25} - .648\,c m^{-.75} - 1.41\,d m^{-1} + O(m^{-1.25}), \tag{7.4}$$

which is consistent with the results obtained.
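The multipliers in an expression of this kind can be checked numerically. The sketch below assumes the 2-fold jackknife has the form A*S(m) + B*(mean of the two half-sample estimates), with A + B = 1 and the weights chosen so that a bias term in m**(-gamma) cancels; the function names are hypothetical. A bias term in m**(-p) is then multiplied by A + B*2**p.

```python
def two_fold_weights(gamma):
    """Weights (A, B) with A + B = 1 that cancel a bias term in m**-gamma."""
    ratio = 2.0 ** gamma              # (m/2)**-gamma = ratio * m**-gamma
    b = -1.0 / (ratio - 1.0)          # from A + B*ratio = 0 with A = 1 - B
    a = 1.0 - b
    return a, b


def residual_multiplier(gamma, power):
    """Factor applied to an m**-power bias term after removing the m**-gamma term."""
    a, b = two_fold_weights(gamma)
    return a + b * 2.0 ** power


# Multipliers left on the terms of 7.3 after the jackknife for the m**-.5 term.
for power in (0.25, 0.5, 0.75, 1.0):
    print(power, round(residual_multiplier(0.5, power), 3))
```

The values agree with the coefficients quoted in 7.4 to within about two units in the third decimal place, and the multiplier on the targeted term is zero up to rounding. Calling `residual_multiplier(0.25, power)` gives the corresponding factors for a jackknife aimed at the $m^{-.25}$ term.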

A jackknife for the $m^{-.25}$ term would result in a bias
of the form

$$\text{bias} = -1.19\,b m^{-.5} - 2.609\,c m^{-.75} - 4.29\,d m^{-1} + O(m^{-1.25}), \tag{7.5}$$

which would indicate that the resulting bias may be large
and negative and converge as $m^{-.5}$.

In fact, a pilot run to jackknife the $m^{-.25}$ term gave
results with a large negative bias and converging
approximately as $m^{-.5}$. These results are plotted in Figure 10.
The pilot run consisted of 120,000 replications for the
$\alpha = .5$ quantile of the unit exponential distribution.

An informal analysis of these results indicated that
the rate of convergence was approximately of order
$m^{-.5}$. These results appear to conform fairly well with the
results predictable from equation 7.5. It would therefore
appear that the bias of the stochastic approximation
with maximum transformation may be of the form in
equation 7.3.

[Figure: Unit Exponential .500 Quantile, $S_{.5} = 0.693147$. Stochastic Approximation With: Maximum Transformation V=1; 2-Fold Jackknife 1/(m**.25) Term. Horizontal axis: sample size.]

If the bias is in fact of the form in 7.3,
a possible approach would be to jackknife for both the
$m^{-.25}$ and $m^{-.5}$ terms. This could be accomplished by using
as pseudo-values

$$S_{ij} = A\,\hat{S}_\alpha(m) + B\,\hat{S}_\alpha(m/2)_i + C\,\hat{S}_\alpha(m/4)_{ij}, \qquad i,j = 1,2, \tag{7.6}$$

where $\hat{S}_\alpha(m)$ and $\hat{S}_\alpha(m/2)_i$ are as previously discussed and
$\hat{S}_\alpha(m/4)_{ij}$ is computed by partitioning the ith subset of
m/2 values into two disjoint subsets of m/4 values each.
Then $\hat{S}_\alpha(m/4)_{ij}$ is the estimate based on the values in the
jth subset from the ith half sample set.

The jackknifed estimate then becomes

$$\bar{S}_\alpha = \tfrac{1}{4}\,(S_{11} + S_{12} + S_{21} + S_{22}), \tag{7.7}$$



with the result that if A, B, and C are chosen as

$$A = 1 - B - C, \tag{7.8}$$

$$B = -\frac{1 + C(\sqrt{2} - 1)}{2^{.25} - 1}, \tag{7.9}$$

$$C = \frac{\sqrt{2} - 2^{.25}}{2^{.25} - 1 - (\sqrt{2} - 1)^2}, \tag{7.10}$$

then

$$E(\bar{S}_\alpha) - S_\alpha = 1.68\,c m^{-.75} + 6.06\,d m^{-1} + O(m^{-1.25}), \tag{7.11}$$

which is void of the $m^{-.25}$ and $m^{-.5}$ bias terms. This may,
of course, inflate the variance of the estimator to an
unacceptable level.
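The weights in 7.8 to 7.10 and the multipliers they leave on the remaining bias terms can be evaluated directly. The sketch below is an illustration only; the function names are hypothetical, and it assumes that the jackknifed estimate is the average of the four pseudo-values, so that a bias term in m**(-p) is multiplied by A + B*2**p + C*4**p.

```python
import math


def three_level_weights():
    """Weights A, B, C from equations 7.8 to 7.10."""
    r = 2.0 ** 0.25
    s = math.sqrt(2.0)
    c = (s - r) / (r - 1.0 - (s - 1.0) ** 2)     # (7.10)
    b = -(1.0 + c * (s - 1.0)) / (r - 1.0)       # (7.9)
    a = 1.0 - b - c                              # (7.8)
    return a, b, c


def multiplier(power):
    """Factor applied to an m**-power bias term by the combined jackknife."""
    a, b, c = three_level_weights()
    return a + b * 2.0 ** power + c * 4.0 ** power


def jackknifed_estimate(full, halves, quarters):
    """Average of the four pseudo-values; halves[i] and quarters[i][j], i, j in {0, 1}."""
    a, b, c = three_level_weights()
    pseudo = [a * full + b * halves[i] + c * quarters[i][j]
              for i in range(2) for j in range(2)]
    return sum(pseudo) / 4.0


print(three_level_weights())
for power in (0.25, 0.5, 0.75, 1.0):
    print(power, multiplier(power))
```

Under these assumptions the multipliers on the $m^{-.25}$ and $m^{-.5}$ terms vanish up to rounding, while those on the $m^{-.75}$ and $m^{-1}$ terms evaluate to roughly 1.68 and 6.06. The weights themselves are large in magnitude (roughly 21, -33, and 13), which is one way to see why the variance of the combined estimator may be inflated.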

In conclusion, an assessment of the utility of the method
for obtaining an extreme quantile discussed in this thesis
can be made by referring to Figures 11 and 12. These show
the standard deviations and root mean square errors of the
order statistic estimate and of the stochastic approximation
estimate with maximum transformation (with and without
jackknife) as functions of the sample size.

There is clearly an increase in the standard deviation
of the quantile estimate, for all m, as one goes from the
estimate based on order statistics to the stochastic
approximation estimate and then to the stochastic
approximation estimate with jackknife. However, this is more
than offset by the (fixed) smaller memory requirements of the
stochastic approximation estimate and by the fact that the
time needed to obtain that estimate from a sample of size m
is proportional to m, while the order statistic estimate
requires memory proportional to m and time (to order the data)
proportional to m log(m). These factors become even more
critical when considered in the context of estimating
large numbers of quantiles or in cases where m is large
(see Goodman, Lewis, and Robbins, 1973).
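The difference in storage and time can be illustrated with a small sketch. Everything here is an assumption made for illustration: the function names, the gain sequence gain/n, the starting value, and the reading of the maximum transformation as applying a Robbins-Monro recursion to maxima of v consecutive observations, whose alpha**v quantile equals the alpha quantile of the original distribution. The recursion keeps only a few scalars and passes over the data once, whereas the order statistic estimate stores and sorts all m values.

```python
import math
import random


def order_statistic_quantile(sample, alpha):
    """Sort-based estimate: stores all m values and sorts them."""
    ordered = sorted(sample)
    index = min(len(ordered) - 1, max(0, math.ceil(alpha * len(ordered)) - 1))
    return ordered[index]


def sa_max_quantile(stream, alpha, v, gain, start=0.0):
    """Robbins-Monro recursion on block maxima; constant memory, one pass."""
    target = alpha ** v               # level of the alpha quantile for the maximum of v values
    estimate, n = start, 0
    block_max, count = -math.inf, 0
    for x in stream:
        block_max = max(block_max, x)
        count += 1
        if count == v:                # a full block of v values has been seen
            n += 1
            indicator = 1.0 if block_max <= estimate else 0.0
            estimate -= (gain / n) * (indicator - target)
            block_max, count = -math.inf, 0
    return estimate                   # values in an incomplete final block are ignored


rng = random.Random(12345)
m = 140_000                           # hypothetical sample size (10,000 blocks of 14)
sample = [rng.expovariate(1.0) for _ in range(m)]
true_q = -math.log(1.0 - 0.95)        # .95 quantile of the unit exponential
os_estimate = order_statistic_quantile(sample, 0.95)
sa_estimate = sa_max_quantile(iter(sample), 0.95, v=14, gain=3.0)
print(true_q, os_estimate, sa_estimate)
```

The list `sample` is built here only so that both estimators see the same data. In a simulation the stochastic approximation recursion could consume values as they are generated, without storing them.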
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